Assessing used content across five digital health information services using transaction log files

Authors

  • David Nicholas
  • Paul Huntington
  • Janet Homewood
Abstract

A digital service, such as a web site, may contain a great deal of information, but we often do not know whether it is used, relevant or valuable. Transaction log files generated by digital information services record the pages (topics or content) viewed by users, and this is perhaps the most interesting aspect of the logs. However, analysing these pages poses many problems for researchers, especially when comparing the content coverage of related services. It is quite normal, even for digital services of the same organization, to adopt different page naming conventions for each service, and this is even more common among digital services run by different organizations. As a result, there is no easy way to compare topic use as revealed by access behaviour. This paper looks at the problems of describing and comparing the content usage of digital information services, covering three digital platforms operating in the health field. It discusses the problems posed in making health content comparisons based on the page names listed in transaction log files and between very large data sets. It reviews the impact that system architecture might have, as well as the effects of the length of time a service has been available online and of differences between outlets. However, the main focus of the article is a comparison of five sources of health information through their log files. It makes use of cluster analysis and applies procedures normally used to define species diversity to research content coverage. In all, two million page views were analysed, covering more than 5000 unique health pages.
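
The abstract does not say which species-diversity procedure the authors applied. As an illustrative sketch only, the Python below assumes the Shannon diversity index and the related evenness measure, treating each health topic as a "species" and its page views as that species' abundance. The service names, topic labels and counts are hypothetical and are not taken from the paper.

```python
import math

def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over topics with non-zero views."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def evenness(counts):
    """Pielou evenness: H divided by ln(number of topics actually viewed)."""
    viewed = sum(1 for c in counts if c > 0)
    if viewed <= 1:
        return 0.0
    return shannon_diversity(counts) / math.log(viewed)

# Hypothetical page-view counts per topic for two services (not from the paper).
services = {
    "kiosk_service": {"diabetes": 400, "asthma": 350, "flu": 250},
    "web_service": {"diabetes": 900, "asthma": 50, "flu": 50},
}

for name, topic_views in services.items():
    counts = list(topic_views.values())
    print(name, round(shannon_diversity(counts), 3), round(evenness(counts), 3))
```

Under these assumptions, a service whose views are spread evenly across topics scores higher than one dominated by a single topic. Because the index depends only on the distribution of views across topics, not on how pages are named, it is one plausible way to compare services that use different page naming conventions, once pages have been mapped to topics.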

Journal:
  • J. Information Science

Volume 29  Issue 

Pages  -

Publication date: 2003